You are given a set of data on housing sale prices for the last few years in King County (near Seattle) between May 2014 and May 2015. Have a look at the variable definitions on the Kaggle page

Tidy up the data ready for regression:

library(tidyverse)
Registered S3 method overwritten by 'dplyr':
  method           from
  print.rowwise_df     
── Attaching packages ──────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
✔ ggplot2 3.2.1     ✔ purrr   0.3.2
✔ tibble  2.1.3     ✔ dplyr   0.8.2
✔ tidyr   0.8.3     ✔ stringr 1.4.0
✔ readr   1.3.1     ✔ forcats 0.4.0
── Conflicts ─────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
library(readr)
library(lubridate)

Attaching package: ‘lubridate’

The following object is masked from ‘package:base’:

    date
house <- read_csv("kc_house_data.csv")
Parsed with column specification:
cols(
  .default = col_double(),
  id = col_character(),
  date = col_datetime(format = "")
)
See spec(...) for full column specifications.
summary(house)
      id                 date                         price            bedrooms        bathrooms    
 Length:21613       Min.   :2014-05-02 00:00:00   Min.   :  75000   Min.   : 0.000   Min.   :0.000  
 Class :character   1st Qu.:2014-07-22 00:00:00   1st Qu.: 321950   1st Qu.: 3.000   1st Qu.:1.750  
 Mode  :character   Median :2014-10-16 00:00:00   Median : 450000   Median : 3.000   Median :2.250  
                    Mean   :2014-10-29 04:38:01   Mean   : 540088   Mean   : 3.371   Mean   :2.115  
                    3rd Qu.:2015-02-17 00:00:00   3rd Qu.: 645000   3rd Qu.: 4.000   3rd Qu.:2.500  
                    Max.   :2015-05-27 00:00:00   Max.   :7700000   Max.   :33.000   Max.   :8.000  
  sqft_living       sqft_lot           floors        waterfront            view       
 Min.   :  290   Min.   :    520   Min.   :1.000   Min.   :0.000000   Min.   :0.0000  
 1st Qu.: 1427   1st Qu.:   5040   1st Qu.:1.000   1st Qu.:0.000000   1st Qu.:0.0000  
 Median : 1910   Median :   7618   Median :1.500   Median :0.000000   Median :0.0000  
 Mean   : 2080   Mean   :  15107   Mean   :1.494   Mean   :0.007542   Mean   :0.2343  
 3rd Qu.: 2550   3rd Qu.:  10688   3rd Qu.:2.000   3rd Qu.:0.000000   3rd Qu.:0.0000  
 Max.   :13540   Max.   :1651359   Max.   :3.500   Max.   :1.000000   Max.   :4.0000  
   condition         grade          sqft_above   sqft_basement       yr_built     yr_renovated   
 Min.   :1.000   Min.   : 1.000   Min.   : 290   Min.   :   0.0   Min.   :1900   Min.   :   0.0  
 1st Qu.:3.000   1st Qu.: 7.000   1st Qu.:1190   1st Qu.:   0.0   1st Qu.:1951   1st Qu.:   0.0  
 Median :3.000   Median : 7.000   Median :1560   Median :   0.0   Median :1975   Median :   0.0  
 Mean   :3.409   Mean   : 7.657   Mean   :1788   Mean   : 291.5   Mean   :1971   Mean   :  84.4  
 3rd Qu.:4.000   3rd Qu.: 8.000   3rd Qu.:2210   3rd Qu.: 560.0   3rd Qu.:1997   3rd Qu.:   0.0  
 Max.   :5.000   Max.   :13.000   Max.   :9410   Max.   :4820.0   Max.   :2015   Max.   :2015.0  
    zipcode           lat             long        sqft_living15    sqft_lot15    
 Min.   :98001   Min.   :47.16   Min.   :-122.5   Min.   : 399   Min.   :   651  
 1st Qu.:98033   1st Qu.:47.47   1st Qu.:-122.3   1st Qu.:1490   1st Qu.:  5100  
 Median :98065   Median :47.57   Median :-122.2   Median :1840   Median :  7620  
 Mean   :98078   Mean   :47.56   Mean   :-122.2   Mean   :1987   Mean   : 12768  
 3rd Qu.:98118   3rd Qu.:47.68   3rd Qu.:-122.1   3rd Qu.:2360   3rd Qu.: 10083  
 Max.   :98199   Max.   :47.78   Max.   :-121.3   Max.   :6210   Max.   :871200  

You might like to think about removing some or all of date, id, sqft_living15, sqft_lot15 and zipcode (lat and long provide a better measure of location in any event).

house_trim <- house %>%
  select(-id) %>%
  select(-date) %>%
  select(-zipcode) %>%
  select(-sqft_living15) %>%
  select(-sqft_lot15)

Have a think about how to treat waterfront. Should we convert its type? NO

unique(house_trim$waterfront)
[1] 0 1

We converted yr_renovated into a renovated logical variable, indicating whether the property had ever been renovated. You may wish to do the same.

house_boolean <- house_trim %>%
  mutate(renovated = as.numeric(yr_renovated != 0)) %>%
  select(-yr_renovated)

colnames(house_boolean)
 [1] "price"         "bedrooms"      "bathrooms"     "sqft_living"   "sqft_lot"      "floors"       
 [7] "waterfront"    "view"          "condition"     "grade"         "sqft_above"    "sqft_basement"
[13] "yr_built"      "lat"           "long"          "renovated"    

Have a think about how to treat condition and grade? Are they interval or categorical ordinal data types?

unique(house_boolean$condition)
[1] 3 5 4 1 2
unique(house_boolean$grade)
 [1]  7  6  8 11  9  5 10 12  4  3 13  1

Check for aliased variables using the alias() function (this takes in a formula object and a data set).

alias(price ~ ., 
      data = house_boolean)
Model :
price ~ bedrooms + bathrooms + sqft_living + sqft_lot + floors + 
    waterfront + view + condition + grade + sqft_above + sqft_basement + 
    yr_built + lat + long + renovated

Complete :
              (Intercept) bedrooms bathrooms sqft_living sqft_lot floors waterfront view condition
sqft_basement  0           0        0         1           0        0      0          0    0       
              grade sqft_above yr_built lat long renovated
sqft_basement  0    -1          0        0   0    0       

square foot living and above highly correlated to sqft_basement - let’s remove them

house_bool <- house_boolean %>%
  select(-sqft_basement) %>%
  select(-sqft_above)

Systematically build a regression model containing up to four main effects (remember, a main effect is just a single predictor with coefficient), testing the regression diagnostics as you go splitting datasets into numeric and non-numeric columns might help ggpairs() run in manageable time, although you will need to add either a price or resid column to the non-numeric dataframe in order to see its correlations with the non-numeric predictors.

library(modelr)
library(GGally)
Registered S3 method overwritten by 'GGally':
  method from   
  +.gg   ggplot2

Attaching package: ‘GGally’

The following object is masked from ‘package:dplyr’:

    nasa
house_num <- subset(house_bool, select = c("price", "bedrooms", "bathrooms", 
                                              "sqft_living", "floors", 
                                              "lat", "long"))
house_cat <- subset(house_bool, select = c("waterfront", "view", "condition", 
                                              "grade", "renovated", "price"))
ggpairs(house_num)

ggpairs(house_cat)

1st model

mod1a_sqft <- lm(price ~ sqft_living, 
                data = house_bool)

mod1b_grade <- lm(price ~ grade, 
                  data = house_bool)

summary(mod1a_sqft)

Call:
lm(formula = price ~ sqft_living, data = house_bool)

Residuals:
     Min       1Q   Median       3Q      Max 
-1476062  -147486   -24043   106182  4362067 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -43580.743   4402.690  -9.899   <2e-16 ***
sqft_living    280.624      1.936 144.920   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 261500 on 21611 degrees of freedom
Multiple R-squared:  0.4929,    Adjusted R-squared:  0.4928 
F-statistic: 2.1e+04 on 1 and 21611 DF,  p-value: < 2.2e-16
plot(mod1a_sqft)

summary(mod1b_grade)

Call:
lm(formula = price ~ grade, data = house_bool)

Residuals:
    Min      1Q  Median      3Q     Max 
-816988 -151958  -36158   97842 6046097 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -1056045      12256  -86.17   <2e-16 ***
grade         208458       1582  131.76   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 273400 on 21611 degrees of freedom
Multiple R-squared:  0.4455,    Adjusted R-squared:  0.4454 
F-statistic: 1.736e+04 on 1 and 21611 DF,  p-value: < 2.2e-16
plot(mod1b_grade)

see if log fixes it (one by one and then both):

plot_1c_log <- mod1a_sqft <- lm(log(price) ~ log(sqft_living), 
                data = house_bool)
summary(plot_1c_log)

Call:
lm(formula = log(price) ~ log(sqft_living), data = house_bool)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.10511 -0.29300  0.01262  0.25701  1.33011 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)      6.729916   0.047062   143.0   <2e-16 ***
log(sqft_living) 0.836771   0.006223   134.5   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3886 on 21611 degrees of freedom
Multiple R-squared:  0.4555,    Adjusted R-squared:  0.4555 
F-statistic: 1.808e+04 on 1 and 21611 DF,  p-value: < 2.2e-16

not really…

cor(house_bool$sqft_living, house_bool$grade)
[1] 0.7627045

high correlation between the sqft_living and the grade. Grade will not be added to the model

Let’s check with anova

mod1a_sqft_grade <- lm(price ~ sqft_living + grade, 
                       data = house_bool)
anova(mod1a_sqft, mod1a_sqft_grade)
models with response ‘"price"’ removed because response differs from model 1
Analysis of Variance Table

Response: log(price)
                    Df Sum Sq Mean Sq F value    Pr(>F)    
log(sqft_living)     1 2730.8 2730.81   18079 < 2.2e-16 ***
Residuals        21611 3264.3    0.15                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

We could use it

summary(mod1a_sqft_grade)

Call:
lm(formula = price ~ sqft_living + grade, data = house_bool)

Residuals:
     Min       1Q   Median       3Q      Max 
-1065457  -138304   -25043   100447  4794633 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -5.981e+05  1.330e+04  -44.98   <2e-16 ***
sqft_living  1.844e+02  2.869e+00   64.29   <2e-16 ***
grade        9.855e+04  2.241e+03   43.97   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 250500 on 21610 degrees of freedom
Multiple R-squared:  0.5345,    Adjusted R-squared:  0.5345 
F-statistic: 1.241e+04 on 2 and 21610 DF,  p-value: < 2.2e-16

but not much improvement in the R-squared

house_num_2 <- house_num %>%
  add_residuals(mod1a_sqft) %>%
  select(-c("price", "sqft_living"))
ggpairs(house_num_2)

house_cat_2 <- house_bool %>%
  add_residuals(mod1a_sqft) %>%
  select(-c("price", "bedrooms", "bathrooms", "sqft_living", 
            "floors", "lat", "long", "grade"))
ggpairs(house_cat_2)

NA

Latitude and view have the best correlation with the residuals:

mod2_sqft_lat <- lm(price ~ sqft_living + lat, 
                    data = house_bool)

mod2_sqft_view <- lm(price ~ sqft_living + view, 
                     data = house_bool)

summary(mod2_sqft_lat)

Call:
lm(formula = price ~ sqft_living + lat, data = house_bool)

Residuals:
     Min       1Q   Median       3Q      Max 
-1487994  -125643   -20309    84613  4368717 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -3.416e+07  5.653e+05  -60.44   <2e-16 ***
sqft_living  2.749e+02  1.794e+00  153.27   <2e-16 ***
lat          7.177e+05  1.189e+04   60.36   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 241900 on 21610 degrees of freedom
Multiple R-squared:  0.566, Adjusted R-squared:  0.566 
F-statistic: 1.409e+04 on 2 and 21610 DF,  p-value: < 2.2e-16
plot(mod2_sqft_lat)

summary(mod2_sqft_view)

Call:
lm(formula = price ~ sqft_living + view, data = house_bool)

Residuals:
     Min       1Q   Median       3Q      Max 
-1583570  -139453   -19322   104245  4321083 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -16853.463   4257.279  -3.959 7.56e-05 ***
sqft_living    256.176      1.934 132.485  < 2e-16 ***
view        102951.162   2317.467  44.424  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 250300 on 21610 degrees of freedom
Multiple R-squared:  0.5353,    Adjusted R-squared:  0.5352 
F-statistic: 1.245e+04 on 2 and 21610 DF,  p-value: < 2.2e-16
plot(mod2_sqft_view)

Looks like latitude is the best Is there a correlation between latitude and view?

cor(house_bool$lat, house_bool$view)
[1] 0.006156732

not really - let see if we can improve the model with the view:

mod3_sqft_lat_view <- lm(price ~ sqft_living + lat + view, 
                    data = house_bool)
summary(mod3_sqft_lat_view)

Call:
lm(formula = price ~ sqft_living + lat + view, data = house_bool)

Residuals:
     Min       1Q   Median       3Q      Max 
-1596893  -115223   -14728    82111  4327282 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -3.438e+07  5.363e+05  -64.11   <2e-16 ***
sqft_living  2.502e+02  1.775e+00  140.93   <2e-16 ***
lat          7.228e+05  1.128e+04   64.08   <2e-16 ***
view         1.042e+05  2.125e+03   49.05   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 229400 on 21609 degrees of freedom
Multiple R-squared:  0.6095,    Adjusted R-squared:  0.6094 
F-statistic: 1.124e+04 on 3 and 21609 DF,  p-value: < 2.2e-16

Hurray!! Let see what else to use

house_3 <- house_bool %>%
  add_residuals(mod3_sqft_lat_view) %>%
  select(-c("price", "grade", "lat", "view"))
ggpairs(house_3)

Waterfront looks good so let’s go with that

mod_4 <- lm(price ~ sqft_living + lat + view + waterfront,  
                    data = house_bool)
summary(mod_4)

Call:
lm(formula = price ~ sqft_living + lat + view + waterfront, data = house_bool)

Residuals:
     Min       1Q   Median       3Q      Max 
-1497101  -113377   -14143    82029  4400900 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -3.467e+07  5.249e+05  -66.05   <2e-16 ***
sqft_living  2.507e+02  1.737e+00  144.35   <2e-16 ***
lat          7.288e+05  1.104e+04   66.02   <2e-16 ***
view         7.691e+04  2.258e+03   34.06   <2e-16 ***
waterfront   5.969e+05  1.928e+04   30.95   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 224500 on 21608 degrees of freedom
Multiple R-squared:  0.6261,    Adjusted R-squared:  0.626 
F-statistic:  9045 on 4 and 21608 DF,  p-value: < 2.2e-16

not much added value - let’s try with the next best :year_build

mod_4b <- lm(price ~ sqft_living + lat + view + yr_built,  
                    data = house_bool)
summary(mod_4b)

Call:
lm(formula = price ~ sqft_living + lat + view + yr_built, data = house_bool)

Residuals:
     Min       1Q   Median       3Q      Max 
-1708731  -111155   -10443    84211  4129084 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)    
(Intercept) -2.941e+07  5.681e+05  -51.77   <2e-16 ***
sqft_living  2.664e+02  1.878e+00  141.87   <2e-16 ***
lat          6.744e+05  1.131e+04   59.62   <2e-16 ***
view         9.589e+04  2.125e+03   45.13   <2e-16 ***
yr_built    -1.370e+03  5.692e+01  -24.06   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 226400 on 21608 degrees of freedom
Multiple R-squared:  0.6197,    Adjusted R-squared:  0.6196 
F-statistic:  8802 on 4 and 21608 DF,  p-value: < 2.2e-16

nope! mod_4 is still the best

2 Extensions Consider possible interactions between your four main effect predictors and test their effect upon r2. Choose your best candidate interaction and visualise its effect. mod_4 <- lm(price ~ sqft_living + lat + view + waterfront,
data = house_bool)

mod5_a <- lm(price ~ sqft_living + lat + view + waterfront + sqft_living:waterfront,  
                    data = house_bool)
mod5_b <- lm(price ~ sqft_living + lat + view + waterfront + waterfront:view,  
                    data = house_bool)

summary(mod5_a)

Call:
lm(formula = price ~ sqft_living + lat + view + waterfront + 
    sqft_living:waterfront, data = house_bool)

Residuals:
     Min       1Q   Median       3Q      Max 
-1418430  -111833   -15343    78675  4479310 

Coefficients:
                         Estimate Std. Error t value Pr(>|t|)    
(Intercept)            -3.431e+07  5.130e+05  -66.89   <2e-16 ***
sqft_living             2.422e+02  1.718e+00  140.98   <2e-16 ***
lat                     7.217e+05  1.079e+04   66.89   <2e-16 ***
view                    7.944e+04  2.208e+03   35.98   <2e-16 ***
waterfront             -5.064e+05  3.930e+04  -12.89   <2e-16 ***
sqft_living:waterfront  3.477e+02  1.087e+01   31.99   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 219400 on 21607 degrees of freedom
Multiple R-squared:  0.643, Adjusted R-squared:  0.6429 
F-statistic:  7783 on 5 and 21607 DF,  p-value: < 2.2e-16
summary(mod5_b)

Call:
lm(formula = price ~ sqft_living + lat + view + waterfront + 
    waterfront:view, data = house_bool)

Residuals:
     Min       1Q   Median       3Q      Max 
-1496810  -113388   -14067    81993  4401105 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)     -3.466e+07  5.249e+05 -66.038  < 2e-16 ***
sqft_living      2.508e+02  1.737e+00 144.345  < 2e-16 ***
lat              7.287e+05  1.104e+04  66.009  < 2e-16 ***
view             7.682e+04  2.263e+03  33.940  < 2e-16 ***
waterfront       5.278e+05  1.196e+05   4.413 1.02e-05 ***
view:waterfront  1.843e+04  3.147e+04   0.586    0.558    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 224500 on 21607 degrees of freedom
Multiple R-squared:  0.6261,    Adjusted R-squared:  0.626 
F-statistic:  7236 on 5 and 21607 DF,  p-value: < 2.2e-16
house_bool %>%
  ggplot(aes(x = sqft_living, y = price, colour = waterfront)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)

Calculate the relative importance of predictors from your best 4-predictor model (i.e. the model without an interaction). Which predictor affects price most strongly?

library(relaimpo)
calc.relimp(mod_4, type = "lmg", rela = TRUE)
Response variable: price 
Total response variance: 134782378397 
Analysis based on 21613 observations 

4 Regressors: 
sqft_living lat view waterfront 
Proportion of variance explained by model: 62.61%
Metrics are normalized to sum to 100% (rela=TRUE). 

Relative importance metrics: 

                   lmg
sqft_living 0.67161080
lat         0.13512450
view        0.13235564
waterfront  0.06090906

Average coefficients for different model sizes: 

                      1X         2Xs         3Xs         4Xs
sqft_living     280.6236    267.8727    257.8186    250.7462
lat          813411.5832 782766.2372 754623.2513 728807.2243
view         190335.2479 152740.5976 114877.1645  76912.3892
waterfront  1130312.4247 839722.1754 664548.3041 596863.2631

sqft_living affects the price most strongly

LS0tCnRpdGxlOiAiUiBOb3RlYm9vayIKb3V0cHV0OiBodG1sX25vdGVib29rCi0tLQoKWW91IGFyZSBnaXZlbiBhIHNldCBvZiBkYXRhIG9uIGhvdXNpbmcgc2FsZSBwcmljZXMgZm9yIHRoZSBsYXN0IGZldyB5ZWFycyBpbiBLaW5nIENvdW50eSAobmVhciBTZWF0dGxlKSBiZXR3ZWVuIE1heSAyMDE0IGFuZCBNYXkgMjAxNS4gSGF2ZSBhIGxvb2sgYXQgdGhlIHZhcmlhYmxlIGRlZmluaXRpb25zIG9uIHRoZSBLYWdnbGUgcGFnZQoKVGlkeSB1cCB0aGUgZGF0YSByZWFkeSBmb3IgcmVncmVzc2lvbjoKYGBge3J9CmxpYnJhcnkodGlkeXZlcnNlKQpsaWJyYXJ5KHJlYWRyKQpsaWJyYXJ5KGx1YnJpZGF0ZSkKYGBgCgpgYGB7cn0KaG91c2UgPC0gcmVhZF9jc3YoImtjX2hvdXNlX2RhdGEuY3N2IikKc3VtbWFyeShob3VzZSkKYGBgCgoKWW91IG1pZ2h0IGxpa2UgdG8gdGhpbmsgYWJvdXQgcmVtb3Zpbmcgc29tZSBvciBhbGwgb2YgZGF0ZSwgaWQsIHNxZnRfbGl2aW5nMTUsIHNxZnRfbG90MTUgYW5kIHppcGNvZGUgKGxhdCBhbmQgbG9uZyBwcm92aWRlIGEgYmV0dGVyIG1lYXN1cmUgb2YgbG9jYXRpb24gaW4gYW55IGV2ZW50KS4KYGBge3J9CmhvdXNlX3RyaW0gPC0gaG91c2UgJT4lCiAgc2VsZWN0KC1pZCkgJT4lCiAgc2VsZWN0KC1kYXRlKSAlPiUKICBzZWxlY3QoLXppcGNvZGUpICU+JQogIHNlbGVjdCgtc3FmdF9saXZpbmcxNSkgJT4lCiAgc2VsZWN0KC1zcWZ0X2xvdDE1KQpgYGAKCkhhdmUgYSB0aGluayBhYm91dCBob3cgdG8gdHJlYXQgd2F0ZXJmcm9udC4gU2hvdWxkIHdlIGNvbnZlcnQgaXRzIHR5cGU/IE5PCmBgYHtyfQp1bmlxdWUoaG91c2VfdHJpbSR3YXRlcmZyb250KQpgYGAKCldlIGNvbnZlcnRlZCB5cl9yZW5vdmF0ZWQgaW50byBhIHJlbm92YXRlZCBsb2dpY2FsIHZhcmlhYmxlLCBpbmRpY2F0aW5nIHdoZXRoZXIgdGhlIHByb3BlcnR5IGhhZCBldmVyIGJlZW4gcmVub3ZhdGVkLiBZb3UgbWF5IHdpc2ggdG8gZG8gdGhlIHNhbWUuCmBgYHtyfQpob3VzZV9ib29sZWFuIDwtIGhvdXNlX3RyaW0gJT4lCiAgbXV0YXRlKHJlbm92YXRlZCA9IGFzLm51bWVyaWMoeXJfcmVub3ZhdGVkICE9IDApKSAlPiUKICBzZWxlY3QoLXlyX3Jlbm92YXRlZCkKCmNvbG5hbWVzKGhvdXNlX2Jvb2xlYW4pCmBgYAoKSGF2ZSBhIHRoaW5rIGFib3V0IGhvdyB0byB0cmVhdCBjb25kaXRpb24gYW5kIGdyYWRlPyBBcmUgdGhleSBpbnRlcnZhbCBvciBjYXRlZ29yaWNhbCBvcmRpbmFsIGRhdGEgdHlwZXM/CmBgYHtyfQp1bmlxdWUoaG91c2VfYm9vbGVhbiRjb25kaXRpb24pCnVuaXF1ZShob3VzZV9ib29sZWFuJGdyYWRlKQpgYGAKCkNoZWNrIGZvciBhbGlhc2VkIHZhcmlhYmxlcyB1c2luZyB0aGUgYWxpYXMoKSBmdW5jdGlvbiAodGhpcyB0YWtlcyBpbiBhIGZvcm11bGEgb2JqZWN0IGFuZCBhIGRhdGEgc2V0KS4gCmBgYHtyfQphbGlhcyhwcmljZSB+IC4sIAogICAgICBkYXRhID0gaG91c2VfYm9vbGVhbikKYGBgCnNxdWFyZSBmb290IGxpdmluZyBhbmQgYWJvdmUgaGlnaGx5IGNvcnJlbGF0ZWQgdG8gc3FmdF9iYXNlbWVudCAtIGxldCdzIApyZW1vdmUgdGhlbQpgYGB7cn0KaG91c2VfYm9vbCA8LSBob3VzZV9ib29sZWFuICU+JQogIHNlbGVjdCgtc3FmdF9iYXNlbWVudCkgJT4lCiAgc2VsZWN0KC1zcWZ0X2Fib3ZlKQpgYGAKCgpTeXN0ZW1hdGljYWxseSBidWlsZCBhIHJlZ3Jlc3Npb24gbW9kZWwgY29udGFpbmluZyB1cCB0byBmb3VyIG1haW4gZWZmZWN0cyAocmVtZW1iZXIsIGEgbWFpbiBlZmZlY3QgaXMganVzdCBhIHNpbmdsZSBwcmVkaWN0b3Igd2l0aCBjb2VmZmljaWVudCksIHRlc3RpbmcgdGhlIHJlZ3Jlc3Npb24gZGlhZ25vc3RpY3MgYXMgeW91IGdvIHNwbGl0dGluZyBkYXRhc2V0cyBpbnRvIG51bWVyaWMgYW5kIG5vbi1udW1lcmljIGNvbHVtbnMgbWlnaHQgaGVscCBnZ3BhaXJzKCkgcnVuIGluIG1hbmFnZWFibGUgdGltZSwgYWx0aG91Z2ggeW91IHdpbGwgbmVlZCB0byBhZGQgZWl0aGVyIGEgcHJpY2Ugb3IgcmVzaWQgY29sdW1uIHRvIHRoZSBub24tbnVtZXJpYyBkYXRhZnJhbWUgaW4gb3JkZXIgdG8gc2VlIGl0cyBjb3JyZWxhdGlvbnMgd2l0aCB0aGUgbm9uLW51bWVyaWMgcHJlZGljdG9ycy4KYGBge3J9CmxpYnJhcnkobW9kZWxyKQpsaWJyYXJ5KEdHYWxseSkKaG91c2VfbnVtIDwtIHN1YnNldChob3VzZV9ib29sLCBzZWxlY3QgPSBjKCJwcmljZSIsICJiZWRyb29tcyIsICJiYXRocm9vbXMiLCAKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICJzcWZ0X2xpdmluZyIsICJmbG9vcnMiLCAKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICJsYXQiLCAibG9uZyIpKQpob3VzZV9jYXQgPC0gc3Vic2V0KGhvdXNlX2Jvb2wsIHNlbGVjdCA9IGMoIndhdGVyZnJvbnQiLCAidmlldyIsICJjb25kaXRpb24iLCAKICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICJncmFkZSIsICJyZW5vdmF0ZWQiLCAicHJpY2UiKSkKCmBgYAoKYGBge3J9CmdncGFpcnMoaG91c2VfbnVtKQpgYGAKCmBgYHtyfQpnZ3BhaXJzKGhvdXNlX2NhdCkKYGBgCgoxc3QgbW9kZWwKYGBge3J9Cm1vZDFhX3NxZnQgPC0gbG0ocHJpY2UgfiBzcWZ0X2xpdmluZywgCiAgICAgICAgICAgICAgICBkYXRhID0gaG91c2VfYm9vbCkKCm1vZDFiX2dyYWRlIDwtIGxtKHByaWNlIH4gZ3JhZGUsIAogICAgICAgICAgICAgICAgICBkYXRhID0gaG91c2VfYm9vbCkKCnN1bW1hcnkobW9kMWFfc3FmdCkKcGxvdChtb2QxYV9zcWZ0KQpgYGAKCmBgYHtyfQpzdW1tYXJ5KG1vZDFiX2dyYWRlKQpwbG90KG1vZDFiX2dyYWRlKQpgYGAKCnNlZSBpZiBsb2cgZml4ZXMgaXQgKG9uZSBieSBvbmUgYW5kIHRoZW4gYm90aCk6CmBgYHtyfQpwbG90XzFjX2xvZyA8LSBtb2QxYV9zcWZ0IDwtIGxtKGxvZyhwcmljZSkgfiBsb2coc3FmdF9saXZpbmcpLCAKICAgICAgICAgICAgICAgIGRhdGEgPSBob3VzZV9ib29sKQpzdW1tYXJ5KHBsb3RfMWNfbG9nKQpgYGAKbm90IHJlYWxseS4uLgoKYGBge3J9CmNvcihob3VzZV9ib29sJHNxZnRfbGl2aW5nLCBob3VzZV9ib29sJGdyYWRlKQpgYGAKaGlnaCBjb3JyZWxhdGlvbiBiZXR3ZWVuIHRoZSBzcWZ0X2xpdmluZyBhbmQgdGhlIGdyYWRlLiBHcmFkZSB3aWxsIG5vdCBiZSBhZGRlZCB0byB0aGUgbW9kZWwKCkxldCdzIGNoZWNrIHdpdGggYW5vdmEKYGBge3J9Cm1vZDFhX3NxZnRfZ3JhZGUgPC0gbG0ocHJpY2UgfiBzcWZ0X2xpdmluZyArIGdyYWRlLCAKICAgICAgICAgICAgICAgICAgICAgICBkYXRhID0gaG91c2VfYm9vbCkKYW5vdmEobW9kMWFfc3FmdCwgbW9kMWFfc3FmdF9ncmFkZSkKYGBgCgpXZSBjb3VsZCB1c2UgaXQKYGBge3J9CnN1bW1hcnkobW9kMWFfc3FmdF9ncmFkZSkKYGBgCgpidXQgbm90IG11Y2ggaW1wcm92ZW1lbnQgaW4gdGhlIFItc3F1YXJlZAoKYGBge3J9CmhvdXNlX251bV8yIDwtIGhvdXNlX251bSAlPiUKICBhZGRfcmVzaWR1YWxzKG1vZDFhX3NxZnQpICU+JQogIHNlbGVjdCgtYygicHJpY2UiLCAic3FmdF9saXZpbmciKSkKZ2dwYWlycyhob3VzZV9udW1fMikKCmBgYAoKYGBge3J9CmhvdXNlX2NhdF8yIDwtIGhvdXNlX2Jvb2wgJT4lCiAgYWRkX3Jlc2lkdWFscyhtb2QxYV9zcWZ0KSAlPiUKICBzZWxlY3QoLWMoInByaWNlIiwgImJlZHJvb21zIiwgImJhdGhyb29tcyIsICJzcWZ0X2xpdmluZyIsIAogICAgICAgICAgICAiZmxvb3JzIiwgImxhdCIsICJsb25nIiwgImdyYWRlIikpCmdncGFpcnMoaG91c2VfY2F0XzIpCiAgCmBgYAoKTGF0aXR1ZGUgYW5kIHZpZXcgaGF2ZSB0aGUgYmVzdCBjb3JyZWxhdGlvbiB3aXRoIHRoZSByZXNpZHVhbHM6IApgYGB7cn0KbW9kMl9zcWZ0X2xhdCA8LSBsbShwcmljZSB+IHNxZnRfbGl2aW5nICsgbGF0LCAKICAgICAgICAgICAgICAgICAgICBkYXRhID0gaG91c2VfYm9vbCkKCm1vZDJfc3FmdF92aWV3IDwtIGxtKHByaWNlIH4gc3FmdF9saXZpbmcgKyB2aWV3LCAKICAgICAgICAgICAgICAgICAgICAgZGF0YSA9IGhvdXNlX2Jvb2wpCgpzdW1tYXJ5KG1vZDJfc3FmdF9sYXQpCnBsb3QobW9kMl9zcWZ0X2xhdCkKYGBgCgpgYGB7cn0Kc3VtbWFyeShtb2QyX3NxZnRfdmlldykKcGxvdChtb2QyX3NxZnRfdmlldykKYGBgCgpMb29rcyBsaWtlIGxhdGl0dWRlIGlzIHRoZSBiZXN0CklzIHRoZXJlIGEgY29ycmVsYXRpb24gYmV0d2VlbiBsYXRpdHVkZSBhbmQgdmlldz8KYGBge3J9CmNvcihob3VzZV9ib29sJGxhdCwgaG91c2VfYm9vbCR2aWV3KQpgYGAKCm5vdCByZWFsbHkgLSBsZXQgc2VlIGlmIHdlIGNhbiBpbXByb3ZlIHRoZSBtb2RlbCB3aXRoIHRoZSB2aWV3OiAKYGBge3J9Cm1vZDNfc3FmdF9sYXRfdmlldyA8LSBsbShwcmljZSB+IHNxZnRfbGl2aW5nICsgbGF0ICsgdmlldywgCiAgICAgICAgICAgICAgICAgICAgZGF0YSA9IGhvdXNlX2Jvb2wpCnN1bW1hcnkobW9kM19zcWZ0X2xhdF92aWV3KQpgYGAKCkh1cnJheSEhCkxldCBzZWUgd2hhdCBlbHNlIHRvIHVzZQpgYGB7cn0KaG91c2VfMyA8LSBob3VzZV9ib29sICU+JQogIGFkZF9yZXNpZHVhbHMobW9kM19zcWZ0X2xhdF92aWV3KSAlPiUKICBzZWxlY3QoLWMoInByaWNlIiwgImdyYWRlIiwgImxhdCIsICJ2aWV3IikpCmdncGFpcnMoaG91c2VfMykKYGBgCldhdGVyZnJvbnQgbG9va3MgZ29vZCBzbyBsZXQncyBnbyB3aXRoIHRoYXQKYGBge3J9Cm1vZF80IDwtIGxtKHByaWNlIH4gc3FmdF9saXZpbmcgKyBsYXQgKyB2aWV3ICsgd2F0ZXJmcm9udCwgIAogICAgICAgICAgICAgICAgICAgIGRhdGEgPSBob3VzZV9ib29sKQpzdW1tYXJ5KG1vZF80KQpgYGAKCm5vdCBtdWNoIGFkZGVkIHZhbHVlIC0gbGV0J3MgdHJ5IHdpdGggdGhlIG5leHQgYmVzdCA6eWVhcl9idWlsZApgYGB7cn0KbW9kXzRiIDwtIGxtKHByaWNlIH4gc3FmdF9saXZpbmcgKyBsYXQgKyB2aWV3ICsgeXJfYnVpbHQsICAKICAgICAgICAgICAgICAgICAgICBkYXRhID0gaG91c2VfYm9vbCkKc3VtbWFyeShtb2RfNGIpCmBgYAoKbm9wZSEgbW9kXzQgaXMgc3RpbGwgdGhlIGJlc3QKCgoyIEV4dGVuc2lvbnMKQ29uc2lkZXIgcG9zc2libGUgaW50ZXJhY3Rpb25zIGJldHdlZW4geW91ciBmb3VyIG1haW4gZWZmZWN0IHByZWRpY3RvcnMgYW5kIHRlc3QgdGhlaXIgZWZmZWN0IHVwb24gcjIuIENob29zZSB5b3VyIGJlc3QgY2FuZGlkYXRlIGludGVyYWN0aW9uIGFuZCB2aXN1YWxpc2UgaXRzIGVmZmVjdC4KbW9kXzQgPC0gbG0ocHJpY2UgfiBzcWZ0X2xpdmluZyArIGxhdCArIHZpZXcgKyB3YXRlcmZyb250LCAgCiAgICAgICAgICAgICAgICAgICAgZGF0YSA9IGhvdXNlX2Jvb2wpCmBgYHtyfQptb2Q1X2EgPC0gbG0ocHJpY2UgfiBzcWZ0X2xpdmluZyArIGxhdCArIHZpZXcgKyB3YXRlcmZyb250ICsgc3FmdF9saXZpbmc6d2F0ZXJmcm9udCwgIAogICAgICAgICAgICAgICAgICAgIGRhdGEgPSBob3VzZV9ib29sKQptb2Q1X2IgPC0gbG0ocHJpY2UgfiBzcWZ0X2xpdmluZyArIGxhdCArIHZpZXcgKyB3YXRlcmZyb250ICsgd2F0ZXJmcm9udDp2aWV3LCAgCiAgICAgICAgICAgICAgICAgICAgZGF0YSA9IGhvdXNlX2Jvb2wpCgpzdW1tYXJ5KG1vZDVfYSkKc3VtbWFyeShtb2Q1X2IpCgpgYGAKICAgICAgICAgICAgICAgICAgICAKYGBge3J9CmhvdXNlX2Jvb2wgJT4lCiAgZ2dwbG90KGFlcyh4ID0gc3FmdF9saXZpbmcsIHkgPSBwcmljZSwgY29sb3VyID0gd2F0ZXJmcm9udCkpICsKICBnZW9tX3BvaW50KCkgKwogIGdlb21fc21vb3RoKG1ldGhvZCA9ICJsbSIsIHNlID0gRkFMU0UpCgpgYGAKCgpDYWxjdWxhdGUgdGhlIHJlbGF0aXZlIGltcG9ydGFuY2Ugb2YgcHJlZGljdG9ycyBmcm9tIHlvdXIgYmVzdCA0LXByZWRpY3RvciBtb2RlbCAoaS5lLiB0aGUgbW9kZWwgd2l0aG91dCBhbiBpbnRlcmFjdGlvbikuIFdoaWNoIHByZWRpY3RvciBhZmZlY3RzIHByaWNlIG1vc3Qgc3Ryb25nbHk/CmBgYHtyfQpsaWJyYXJ5KHJlbGFpbXBvKQpjYWxjLnJlbGltcChtb2RfNCwgdHlwZSA9ICJsbWciLCByZWxhID0gVFJVRSkKCmBgYAoKCnNxZnRfbGl2aW5nIGFmZmVjdHMgdGhlIHByaWNlIG1vc3Qgc3Ryb25nbHk=